tidyverse
These materials are based on the APS's “R for Plant Pathologists”, a more comprehensive
workshop available here
Performance: stable, light and fast
Support network: documentation, community, developers
Reproducibility: anyone anywhere can reproduce results
Versatility: a unified solution to almost any numerical problem, with strong graphical capabilities
Ethics: accessible to anyone as it is free and open source
Transition from “point and click” is tough but rewarding
Help:
Learning:
Cheatsheets → https://rstudio.com/resources/cheatsheets/
R – programming language for statistical computing, data manipulation, and graphics
http://www.r-project.org/
RStudio – Integrated Development Environment (IDE) makes our life much easier
R – Engine
RStudio – Dashboard
Tip for later:
Objects, where the data is stored.
Data is assigned using <-
x <- 1
y <- 2
x + y
[1] 3
the same result if:
1 + 2
[1] 3
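To make the idea of objects a bit more concrete, here is a minimal sketch. The object names (disease_score, n_plants) are hypothetical and chosen only for illustration; they are not part of the original material.

```r
# Hypothetical object names, used only for illustration
disease_score <- 3.5                 # store a number in an object
n_plants <- 20L                      # the L suffix makes it an integer

disease_score                        # typing the name prints the stored value
class(disease_score)                 # "numeric"
class(n_plants)                      # "integer"

# Assigning to an existing name overwrites the old value
disease_score <- disease_score * 2
disease_score                        # 7
```

Note that nothing is printed when a value is assigned; the value only shows up when the object name is typed on its own.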
Functions are applied on these objects to analyze the data.
# I am a comment!!! Just here to help jog the memory later on...
# Let us make a function!
addition <- function(argument_one,
                     argument_two) {
  argument_one + argument_two # operations
} # curly brackets define operations
ls() # check content of the environment
[1] "addition" "x" "y"
addition(argument_one = x,
         argument_two = y)
[1] 3
# Notice the difference?!
addition(x, y)
[1] 3
addition(x, y) == x + y # notice the double "=": it tests equality
[1] TRUE
all.equal(addition(x, y), x + y) # built-in check; tolerates tiny numeric differences
[1] TRUE
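The calls above touch on two things that are easy to miss: how arguments are matched, and how `==` differs from all.equal(). A small sketch of both follows; subtraction() is a hypothetical helper, used here only because the order of its arguments changes the result.

```r
# Hypothetical helper, not part of the original material
subtraction <- function(argument_one, argument_two) {
  argument_one - argument_two
}

subtraction(5, 2)                                # matched by position: 5 - 2
subtraction(argument_two = 2, argument_one = 5)  # matched by name: order is free
subtraction(2, 5)                                # position swapped: 2 - 5

# `==` tests exact equality, which can surprise with decimal numbers
0.1 + 0.2 == 0.3                    # FALSE, because of floating-point rounding
# all.equal() allows a small tolerance; isTRUE() gives a clean TRUE/FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE
```

Naming arguments is more typing, but it protects against mixing up their order.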
Vectors store data of the same type
(like a column of an Excel table)
num <- c(50, 60, 65)
char <- c("mouse", "rat", "dog")
fct <- factor(c("low", "med", "high")) # values must be wrapped in c()
dates <- as.Date(c("02/27/92", "02/27/92", "01/14/92"), format = "%m/%d/%y")
logical <- c(FALSE, FALSE, TRUE) # only TRUE or FALSE
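"Same type" has a practical consequence that is worth seeing once. The sketch below checks the type of a vector with class() and shows what happens when types are mixed; the object `mixed` and the explicit `levels` ordering are additions made for illustration.

```r
num <- c(50, 60, 65)
char <- c("mouse", "rat", "dog")

class(num)       # "numeric"
class(char)      # "character"
length(num)      # 3

# Mixing types forces every element to the most flexible type
mixed <- c(1, "rat", TRUE)
class(mixed)     # "character"
mixed            # "1" "rat" "TRUE"

# Factors store categories; `levels` sets their order explicitly
fct <- factor(c("low", "med", "high"), levels = c("low", "med", "high"))
levels(fct)      # "low" "med" "high"
as.integer(fct)  # 1 2 3
```

Without the `levels` argument, factor() orders the levels alphabetically ("high", "low", "med"), which is rarely what an ordered scale needs.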
Subsetting and Indexing
num[1] # 1st element
[1] 50
num[num >= 60] # greater than or equal to 60
[1] 60 65
char == "dog" # same values as the `logical` vector defined earlier
[1] FALSE FALSE TRUE
char[logical]
[1] "dog"
char[char == "dog"]
[1] "dog"
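A few more indexing patterns build on the same idea. This is only a sketch that reuses the num and char vectors from above; the "cat" value is made up to show a non-match.

```r
num <- c(50, 60, 65)
char <- c("mouse", "rat", "dog")

num[-1]                          # drop the 1st element: 60 65
num[c(1, 3)]                     # pick several positions: 50 65
which(num >= 60)                 # positions where the test is TRUE: 2 3
char[num > 55 & num < 65]        # combine conditions with &: "rat"
char[char %in% c("dog", "cat")]  # match against a set of values: "dog"
```

In every case the square brackets receive either positions or a TRUE/FALSE vector of the same length as the vector being subset.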
A data frame is a set of vectors of the same length (an entire Excel table)
df <- data.frame(
  col_one = num,
  col_two = char
)
print(df)
  col_one col_two
1      50   mouse
2      60     rat
3      65     dog
head(df, 1)
  col_one col_two
1      50   mouse
Same logic for indexing, just in 2 dimensions
df[1, 1] # [rows, columns]
[1] 50
df[, 1] # 1st column in the data frame
[1] 50 60 65
df[, -2] # Exclude 2nd column
[1] 50 60 65
df[2:3, "col_two"]
[1] "rat" "dog"
df$col_two
[1] "mouse" "rat" "dog"
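The same two-dimensional indexing can also filter rows by a condition. The following is a minimal sketch that rebuilds the df from above; the new column name col_three is hypothetical.

```r
df <- data.frame(
  col_one = c(50, 60, 65),
  col_two = c("mouse", "rat", "dog")
)

df[df$col_one >= 60, ]              # rows where col_one is at least 60
df[df$col_two == "dog", "col_one"]  # col_one for the "dog" row: 65
df[, "col_one", drop = FALSE]       # keep a single column as a data frame
df$col_three <- df$col_one / 10     # add a new column (hypothetical name)
str(df)                             # compact overview of the column types
```

Selecting a single column returns a plain vector by default, which is why df[, -2] above printed as a vector; drop = FALSE keeps the data frame structure.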
Packages: pre-made sets of functions for common (and not so common) tasks
tidyverse: another level, a package of packages
Think of something like the Microsoft Office suite
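In practice, a package is installed once and then loaded in every session where it is needed. A hedged sketch of that workflow is below; the install line is commented out so the example runs without internet access, and has_tidyverse is a hypothetical helper name.

```r
# Install once per machine (commented out so the sketch runs offline):
# install.packages("tidyverse")

# Check whether the package is available before loading it
has_tidyverse <- requireNamespace("tidyverse", quietly = TRUE)

if (has_tidyverse) {
  library(tidyverse)  # loads the core tidyverse packages for this session
} else {
  message("tidyverse is not installed; run install.packages(\"tidyverse\")")
}
```

install.packages() downloads the package to the computer, while library() makes its functions available in the current session.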